perm filename NLM[AM,DBL]2 blob
sn#500054 filedate 1980-04-09 generic text, type C, neo UTF8
JEANNE: THIS IS THE LONG REPORT:
PROJECT 2: WORKBENCH FOR KNOWLEDGE REPRESENTATION
Investigator: Douglas B. Lenat
1. Objectives
The major objective is to transfer artificial intelligence expertise to
various applications, primarily medical, by creating large software
packages which can be shared among these different tasks. The set of
tools constitutes a workbench for knowledge engineers, thereby
facilitating the rapid construction of expert knowledge-based programs and
the advancement of core research (qv) of the project.
This year, our efforts have focused on applying our existing packages
(AGE, EMYCIN, UNITS) to new medical tasks, and on designing and
constructing an entirely new tool, RLL (since this is primarily core
research, please see that section for information about RLL). The EMYCIN
package was applied to the problem of diagnosing various blood coagulation
disorders; of some interest is the relatively small amount of time
required for this implementation (a few weeks). The UNITS package has
been augmented by a more sophisticated package for automatically
explaining its reasoning to the watching human user. It has been used
extensively by molecular biologists. The AGE system has generated so much
interest that a week-long seminar was conducted to instruct a dozen
researchers in its use.
2. Studies and Results
2.1 Using EMYCIN for Diagnosis of Blood Coagulation Disorders
The development of a new medical consultant has allowed us to test the
current knowledge acquisition facilities of EMYCIN. The system, named
CLOT, is designed to diagnose various blood coagulation disorders in a
patient. It requests details about a current episode of bleeding, various
facts from the patient's medical history, and the results of a battery of
coagulation screening tests. From this information CLOT infers the
presence and type of the patient's coagulation defect (if any) and then
attempts to isolate the specific enzymatic deficiency or platelet defect.
These diagnoses can be used by a physician to estimate the severity and
cause of a particular episode of bleeding, evaluate the effects of various
anti-coagulation therapies on a patient, and estimate the pre-operative
risk of a patient having serious bleeding problems during surgery.
Constructed with the help of David Goldman, a medical student at the
University of Missouri, CLOT was implemented following approximately 10
hours of discussion about the contents of the knowledge base and was then
entered and debugged using EMYCIN in another 10 hours. The knowledge base
contains approximately 60 rules and a comparable number of clinical
parameters. It must be emphasized that this system is a preliminary
version of a more substantial consultant and we make no claims about the
current version's level of clinical expertise. However, we feel that the
structure of the knowledge base, in terms of the tasks it performs and the
manner in which it pursues these tasks, reflects the major factors and
reasoning strategies required for expert performance in this domain.
We found that the knowledge acquisition tools of EMYCIN had substantially
improved since the construction of SACON, the last EMYCIN consultant
program. These facilities now perform a large amount of
useful checking and default specification regarding the form and content
of a knowledge base. In particular, there is a new facility which
provides aid when acquiring and specifying the context tree for a
consultant, eliminating a substantial amount of the tedium required to set
up the multitude of associated data structures for each context and ensure
their consistency. In addition, the facility for acquiring clinical
parameters of a context now does a significant amount of value checking on
the basis of a simple parameter classification scheme which we also found
very helpful.
We made use of the new ARL (Abbreviated Rule Language) facility when
acquiring the rules for CLOT. Designed to capitalize on the
stereotypically terse expression of rule clauses by experts, ARL reduces
the amount of typing time and, again, ensures the correct forms are used
for specifying the premise functions. In addition to ARL, the new rule
clause subsumption checker also proved very useful during the
specification of the larger rule sets in the system. This checker
analyzes each new rule for any syntactic subsumptions or equivalences with
the premise clauses of other rules. We found that, in the larger rule
sets, rules were sometimes specified with identical premise
clauses, due either to a typing mistake or to an actual error in the rule
specification. The checker detected these inconsistencies and provided a
graceful method of dealing with them.
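The kind of check described above can be sketched roughly as follows. This is a minimal illustration in Python, not EMYCIN's Interlisp code. It assumes that a premise is a conjunction of clauses, each written as a (parameter, predicate, value) triple; the names Rule, compare_premises, and check_new_rule are hypothetical, and the example clauses are invented.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

# Assumed clause form: (parameter, predicate, value)
Clause = Tuple[str, str, str]


class Rule(NamedTuple):
    name: str
    premise: FrozenSet[Clause]  # clauses are implicitly AND-ed together


def compare_premises(new: Rule, old: Rule) -> str:
    """Classify the purely syntactic relation between two premises."""
    if new.premise == old.premise:
        return "equivalent"
    if new.premise < old.premise:
        # new has a strict subset of old's clauses, so it fires whenever old does
        return "new subsumes old"
    if old.premise < new.premise:
        return "old subsumes new"
    return "unrelated"


def check_new_rule(new: Rule, rule_set: List[Rule]) -> List[Tuple[str, str]]:
    """List every existing rule whose premise is related to the new rule's."""
    findings = []
    for old in rule_set:
        relation = compare_premises(new, old)
        if relation != "unrelated":
            findings.append((old.name, relation))
    return findings


# Invented example clauses; they carry no clinical meaning.
existing = [
    Rule("RULE1", frozenset({("bleeding", "is", "present"), ("pt", "is", "prolonged")})),
    Rule("RULE2", frozenset({("bleeding", "is", "present")})),
    Rule("RULE3", frozenset({("platelets", "is", "low")})),
]
candidate = Rule("RULE4", frozenset({("pt", "is", "prolonged"), ("bleeding", "is", "present")}))
report = check_new_rule(candidate, existing)
```

Here the candidate is reported as equivalent to RULE1 (the same clauses typed in a different order) and as subsumed by the more general RULE2. Storing premises as sets makes clause order irrelevant, which suits a purely syntactic comparison.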
In addition to testing the existing facilities for acquiring the knowledge
required for a new consultant, we have used this experience to investigate
other potential methods for improving these facilities. In fact, the
primary motivation for constructing this consultant was to explore the use
of more generic knowledge about performing diagnosis to guide and focus
the acquisition of a diagnostic consultant for a specific domain. The
major hypothesis of this exploration is that there are concepts about
doing diagnosis in general, that are independent of a particular
application and that can be used to suggest tasks and strategies which an
expert might consider when formulating his knowledge about a specific
diagnostic task. We recorded the sessions in which CLOT was specified and
derived an outline of the major topics that were introduced and discussed
during the specification. In spite of the system's relatively small size,
the CLOT example demonstrated that these generic concepts were useful,
that they did serve to focus and guide the discussions, and that they
could be used to suggest new topics that the expert felt he should
consider in this domain. We are currently implementing a knowledge
acquisition system which makes use of these generic, task-specific
concepts to interact with the expert.
2.2 Augmenting the User Interface to UNITS
The Units system is a powerful and flexible frame-based knowledge
representation package, developed in the context of experiment design in
molecular genetics. It has since been employed in a variety of other uses
in that domain, and is being tried on applications outside genetics as
well.
The primary purpose of the Units system is to facilitate the construction
of large knowledge bases by the individual domain experts. Because of
this, the front-end of the system must be friendly to
non-computer-proficient users. Many different types of information are
included in a molecular genetics knowledge base; these different
information types are described by using small editing systems specialized
to the particular type. For example, much of the knowledge about nucleic
acid structures is commonly described in the form of maps of those
structures. The maps show the location of genes and control regions, and
also the location of the cutting sites of restriction endonucleases.
Significant effort went into designing a map editor that provided for both
facile input and output. [See Appendix ]. This map editor, in conjunction
with the remainder of the genetics knowledge base, is proving an
invaluable tool for the analysis of nucleic acid structures. Users
typically enter raw sequence data about a new structure and then, using
English-like rules, create many different types of maps relating to the
structure. The time to create these maps has been reduced by over an
order of magnitude from the previous best method, and their accuracy and
readability have increased greatly.
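As a rough illustration of the sort of map derived from raw sequence data, the sketch below locates restriction endonuclease recognition sites in a linear sequence. It is written in Python and is not the Units map editor or its rule language; the function name find_sites and the short test sequence are invented for the example. The three recognition sequences shown (EcoRI, BamHI, HindIII) are the standard ones.

```python
from typing import Dict, List, Tuple

# Standard recognition sequences for three common restriction endonucleases.
ENZYMES: Dict[str, str] = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence: str, enzymes: Dict[str, str]) -> List[Tuple[int, str]]:
    """Return (position, enzyme) pairs sorted by position.

    Positions are 1-based and mark the first base of the recognition
    site, not the exact cut point. The sequence is treated as linear.
    """
    seq = sequence.upper()
    hits = []
    for name, site in enzymes.items():
        start = seq.find(site)
        while start != -1:
            hits.append((start + 1, name))
            start = seq.find(site, start + 1)  # continue past this hit
    return sorted(hits)


# Invented 24-base sequence, used only to exercise the function.
site_map = find_sites("AAGAATTCTTGGATCCAAGAATTC", ENZYMES)
# site_map == [(3, "EcoRI"), (11, "BamHI"), (19, "EcoRI")]
```

Because these three recognition sequences are palindromic, scanning one strand is enough to find every site for them.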
This year, the UNITS group has reorganized the UNITS files to include
Masterscope information, to facilitate automatic explanation of what the
program is doing. Some minor internal cleanup and simplification followed.
We sent the files to Dr. Reid Smith at DREA in Canada and he made further
simplifications, including removal of the status files and creation of a
new reference manual. Tapes have been requested from as far away as
Japan, as well as from domestic sites (e.g., Dr. Nancy Martin at UNM).
The map-making and sequence analysis function of the Units system, a side
benefit of the knowledge base construction effort, is now being used by
many groups at Stanford, among them those of Professors Paul Berg, Stanley
Cohen, Laurence Kedes, and Douglas Brutlag. The utility of such
representation work underscores the need for work on the human interface
in knowledge representation systems, as well as the realization that
different types of information, even if internally represented in similar
ways, should be externally presented in a custom-designed manner. It also
shows the importance of providing user-accessible control over the
manipulation of knowledge, in this case the English-like rule language.
The following is a statement from Dr. Brutlag: "I have discovered the
beauty of focus and direction in kb (knowledge base) making. By limiting
our kb to a very small set of problems we now have an extremely useful
knowledge base. With the help of Peter [Friedland], who has provided some
of the best editors and datatype handlers that I've seen for molecular
type of information, we have made a useful knowledge base in that it not
only allows us to store and edit very complex information and rules, but
it lets us process it with powerful sets of production rules that are
written practically in biochemease. I currently use the kb and our set of
rules in everyday use in the laboratory. I have also been invited to
present a talk on our work in Heidelberg Germany this month to
biochemists. Maybe I can stir up some interest in AI among this group.
Anyway, I still am having some problems in the data acquisition phase. I
can organize data and think of production rules about 10 times faster than
I can enter it into the current editors. My time would be better spent if
I could enter rules and data in a more free format like using a standard
text editor. Eventually we might have a rule parser that can read a set
of rules written in a freer style directly from a text file."
As is clear from Brutlag's note, the current MOLGEN UNITS package is
already a useful tool, and its usage points out areas in which further
development should proceed.
A case study of the use of UNITS is presented as an appendix to this
report.
2.3 Time-Oriented Knowledge Representation Package
One direction for the workbench section of this project is the development
of tools for representing knowledge about time-varying situations. This
work has concentrated on the process of turning one application program
into a more general representation package. The application program
(called VM) was designed for real-time interpretation of measurement data
from patients in the intensive care unit using a rule-based approach.
This program was designed specifically around the problem of aggregating
many separate determinations of patient measurements taken over a period of
time into a historical record of changes in the patient's status.
The first step in the development of a useful package from an existing
program is the identification of the particular simplifying assumptions of
the application area. VM took advantage of the fact that the ambiguity of
the domain was primarily due to changes in therapeutic context, rather
than to poor signal sources or to a very imprecise theory of how to
interpret the signal data. The representation of these therapeutic
contexts is limited by the assumption that only one of these contexts can
be used to describe the current setting. The current design for the
time-oriented package provides a more flexible representation that
admits more ambiguity (from all sources) but maintains a simple control
structure (forward-driven processing) for manipulating the representation.
This is accomplished by attaching measures of belief/certainty to
aggregates of conclusions that have been derived over a period of time.
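One way to picture this design is the following Python sketch. It is an illustration under stated assumptions, not the package itself: it assumes each time step carries a weight for every candidate therapeutic context, that a rule contributes its context's weight to a conclusion when its test succeeds, and that the belief in an aggregate is the mean of the per-step beliefs. The combination scheme, the function names, the context names, and the thresholds are all invented.

```python
from typing import Callable, Dict, List, Tuple

Measurements = Dict[str, float]
# A rule: (context it applies in, test on the measurements, conclusion)
RuleSpec = Tuple[str, Callable[[Measurements], bool], str]


def step_beliefs(context_weights: Dict[str, float],
                 measurements: Measurements,
                 rules: List[RuleSpec]) -> Dict[str, float]:
    """One forward-driven pass: fire every rule whose context is plausible."""
    beliefs: Dict[str, float] = {}
    for context, test, conclusion in rules:
        weight = context_weights.get(context, 0.0)
        if weight > 0 and test(measurements):
            # Assumed scheme: contexts are alternatives, so weights add.
            beliefs[conclusion] = beliefs.get(conclusion, 0.0) + weight
    return beliefs


def aggregate(history: List[Tuple[Dict[str, float], Measurements]],
              rules: List[RuleSpec]) -> Dict[str, float]:
    """Attach a belief to each conclusion aggregated over the whole period."""
    totals: Dict[str, float] = {}
    for context_weights, measurements in history:
        for conclusion, belief in step_beliefs(context_weights, measurements, rules).items():
            totals[conclusion] = totals.get(conclusion, 0.0) + belief
    steps = len(history)
    return {c: total / steps for c, total in totals.items()} if steps else {}


# Invented contexts and thresholds, for illustration only.
rules: List[RuleSpec] = [
    ("assisted", lambda m: m["rate"] > 20, "rate high"),
    ("weaning", lambda m: m["rate"] > 30, "rate high"),
]
history = [
    ({"assisted": 0.75, "weaning": 0.25}, {"rate": 25.0}),
    ({"assisted": 0.5, "weaning": 0.5}, {"rate": 35.0}),
]
summary = aggregate(history, rules)
```

In this example the first step supports "rate high" only under the "assisted" context (belief 0.75), the second under both contexts (belief 1.0), so the aggregate belief is 0.875. The control structure stays a single forward pass per time step even though no one context is assumed to hold.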
2.4 A Workshop to Disseminate the AGE System
In the AGE system an attempt has been made to isolate inference, control,
and representation techniques from a few previous knowledge-based systems,
and reprogram them for domain independence. AGE is a library of
building-block programs (called "components") combined with an interface
that assists the user in the design and construction of knowledge-based
programs. It is hoped that AGE will speed up the process of building
knowledge-based programs and facilitate the dissemination of AI techniques
by: (1) packaging common AI software tools so that they do not need to be
reprogrammed for every problem; and (2) helping people who are not
knowledge-engineering specialists write knowledge-based programs.
The components in AGE have been carefully selected and modularly
programmed to be useable in combinations. For users not yet familiar
enough with the components to experiment with combining them, AGE currently
provides two predefined configurations of components; each configuration is
called a "framework". One framework uses the concepts of a globally
accessible data structure called a "blackboard", and independent sources
of knowledge which cooperate to form hypotheses. The Blackboard model has
been modified to allow flexibility in representation, selection, and
utilization of knowledge. The other framework (called Backchain) is for
building programs that use backward-chained production rules as their
primary method of generating inferences.
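For readers unfamiliar with backward chaining, the idea behind the second framework can be sketched as follows. This is a generic Python illustration, not AGE's Backchain component (which is written in Interlisp); the function name backchain and the rule format, a tuple of premises paired with a conclusion, are assumptions made for the example.

```python
from typing import List, Optional, Set, Tuple

# Assumed rule format: (premises, conclusion)
ProductionRule = Tuple[Tuple[str, ...], str]


def backchain(goal: str,
              rules: List[ProductionRule],
              facts: Set[str],
              _pending: Optional[Set[str]] = None) -> bool:
    """Return True if goal is a known fact or can be derived from the rules.

    Works backward from the goal: find rules that conclude it, then try
    to establish each of their premises as a subgoal.
    """
    if goal in facts:
        return True
    pending = _pending if _pending is not None else set()
    if goal in pending:
        return False  # goal is already being pursued; avoid circular reasoning
    pending.add(goal)
    try:
        for premises, conclusion in rules:
            if conclusion == goal and all(
                backchain(p, rules, facts, pending) for p in premises
            ):
                facts.add(goal)  # remember the derived conclusion
                return True
        return False
    finally:
        pending.discard(goal)


# Invented symbolic rules, for illustration only.
rules: List[ProductionRule] = [
    (("a", "b"), "c"),
    (("c",), "d"),
    (("e",), "a"),
]
facts = {"b", "e"}
derived = backchain("d", rules, facts)
```

Asking for "d" leads the interpreter to pursue "c", then "a" and "b", and finally "e", which is known; only rules relevant to the goal are ever examined. A blackboard program, by contrast, is driven by changes to the shared data structure rather than by a single top-level goal.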
To support the user in the selection, specification, and use of the
components, AGE is currently organized around several major interacting
subsystems: BROWSE (guides the user in browsing through the hierarchical
descriptions of components and their use); DESIGN (guides the user in the
design and construction of his program through the use of predefined
configurations of components); ACQUISITION (acquires task-specific
information); INTERPRETER (modules that help the user run and debug his
program); and EXPLANATION (replays AGE's execution steps and lists the
justifications for the actions).
AGE-1 has been available to a limited number of users on an experimental
basis since October 1979. A more public version is scheduled to be available
in early July 1980.
A three-day workshop was conducted during the week of March 4, 1980 for a
limited number of people who had requested access to AGE. Without
exception the attendees represented organizations that wished to build
knowledge-based programs but could not do so for lack of qualified
AI programmers. The aim of the workshop was to familiarize the attendees with
AGE (many of them needed help in learning the Interlisp language) and to have
a running program by the end of the workshop. Each attendee was required
to bring his own problem to be implemented. The organizations
that sent attendees to the workshop, and brief descriptions of the problems
they are interested in implementing in AGE, are listed below:
Information Science Group, University of Missouri-Columbia:
Interpretation of test results for determining the cause of blood
coagulation problems in patients with excessive bleeding. If the
interpretation problem can be successfully implemented, they will go on to
implement a program that recommends anticoagulant therapy.
Institute of Medical Electronics, University of Tokyo: Diagnosis of
cardiovascular disease using diverse data and knowledge, and therapy
recommendation with re-evaluation of the diagnosis. In general, this group is
interested in building programs that serve as research tools rather than
as applied clinical tools.
Department of Psychology, University of Colorado: This group is using the
Blackboard framework in AGE to build a psychological model of prose
comprehension. They had been using AGE for a few months prior to the
workshop.
Oak Ridge National Laboratory: Interpretation of physical
signals--non-medical application.
Schlumberger-Doll Research Center: Interpretation of physical
signals--non-medical application.
3. Goals for the Coming Year
The application of EMYCIN to medical tasks should continue unabated this
year. In particular, the CLOT group is currently implementing a knowledge
acquisition system which makes use of the generic, task-specific concepts
(the ones isolated this year) to interact with the medical expert, to
assist him in staying "focussed" during the diagnostic process. The
self-explanation capabilities added to UNITS will be tested in several
domains this coming year, and may lead to better ideas about explanation,
and thence an even better facility for EMYCIN.
Dr. W. F. Bodmer of the British Imperial Cancer Research Fund Laboratory
has just been provided with access to and instruction in the use of the
MOLGEN UNITS system. He intends to distribute it to his staff scientists
to use in their routine work. The map-making and sequence analysis
facilities, which were developed to facilitate the rapid construction of
large knowledge bases, are of chief interest to him. As Dr. Douglas
Brutlag requests in his statement quoted above, one thrust of the UNITS
development work will be to make the editors "faster". This does not mean
run faster, since they take very little computer time even now; rather it
means a kind of human engineering, a tailoring of the editors to the way
in which scientists want to enter their knowledge.
Besides the current focus on increasing system efficiency (i.e., garbage
collection), we are revising the internal UNITS representations to provide
further simplifications and easier use. A new paging algorithm is also
under study.
The basic UNITS access files are now being used in AGE as an auxiliary
data base. We shall experiment to locate any synergy which derives from
this combination of two of our most powerful tools.
The work on time-oriented reasoning is continuing and being generalized in
such a way as to automate the generation of multiple expectations for the
appropriate ranges for incoming data based on uncertainty in the
determination of the current context. These changes will provide for a
larger number of potential applications of the time-based representation
system.
Although there has been some use of AGE, there needs to be an extensive
test of its capabilities. We intend to implement a relatively complex
application problem ourselves that would serve as this test. At the same
time we will use feedback from outside users to improve the system.
For AGE-2, our plan is to improve the user interface so that
non-specialists in knowledge-based programs can use the system without having
to attend a workshop. The plan includes extensive research into the problem of
determining appropriate knowledge representation and processing for given
tasks. This involves characterizing problems in a variety of ways, and
matching the characterizations with those in the various frameworks still
to be implemented.
RLL (see the section on core research for details) is now sufficiently
fully implemented to be chosen for development; several groups have
expressed interest in using it as a tool to facilitate the construction of
expert programs for their tasks; these include Stanford's Dr. Harold Brown
(VLSI layout task), Rand's Dr. Frederick Hayes-Roth (Analogy-finding
task), and Stanford's Dr. Douglas Brutlag (Evolutionary genetics pathways
task). Lenat (RLL) and Stefik (Molgen UNITS) have been asked to be the
co-presenters of a special tutorial on tools for knowledge engineering at
this summer's AAAI conference (American Association for Artificial
Intelligence). Lenat has also organized (in conjunction with Hayes-Roth
and Waterman of Rand) a workshop on Expert Systems, for August, 1980.
These two activities are expected to lead to a new synthesis of knowledge
engineering, to appear as both a survey article and a book.
As this program's charter indicates, such applications (both medical and
nonmedical) lead to improved knowledge engineering tools, which in turn
facilitate the creation of future expert systems. The specific plans
mentioned above illustrate this theme.
APPENDICES
Please include Appendix D from the MOLGEN proposal.
If Peter Friedland can't get you (or Ed) one electronically,
then you'll have to Xerox it from the document itself.
JEANNE: THIS IS THE SHORT SUMMARY:
PROJECT 2: WORKBENCH FOR KNOWLEDGE REPRESENTATION
Investigator: Douglas B. Lenat
1. Objectives
The major objective is to transfer artificial intelligence expertise to
various applications, primarily medical, by creating large software
packages which can be shared among these different tasks. The set of
tools constitutes a workbench for knowledge engineers, thereby
facilitating the rapid construction of expert knowledge-based programs and
the advancement of core research. This year, our efforts have focused on
applying our existing packages (AGE, EMYCIN, UNITS) to new medical tasks,
on generalizing our VM package for time-oriented reasoning, and on
constructing an entirely new tool, RLL.
2. Studies and Results
The development of a new medical consultant program, CLOT, has allowed us
to test the current knowledge acquisition facilities of EMYCIN. It
diagnoses various blood coagulation disorders in a patient, given his
medical history and current coagulation screening tests. CLOT attempts to
isolate the specific enzymatic deficiency or platelet defect. These
diagnoses can be used by a physician to estimate the severity and cause of
a particular episode of bleeding, evaluate the effects of various
anti-coagulation therapies on a patient, and estimate the pre-operative
risk of a patient having serious bleeding problems during surgery. CLOT
was implemented following approximately 10 hours of discussion about the
contents of the knowledge base and was then entered and debugged using
EMYCIN in another 10 hours. The knowledge base contains approximately 60
rules and a comparable number of clinical parameters. One new feature of
EMYCIN useful in the rapid implementation of CLOT was the automatic
checker (and default value specifier) for values of clinical parameters.
The major hypothesis of this exploration is that there are concepts about
doing diagnosis in general, that are independent of a particular
application and that can be used to suggest relevant strategies
opportunely.
The UNITS system is designed to facilitate the construction of large
knowledge bases by individual domain experts. Hence the front-end of the
system must be friendly to non-computer-proficient users. For its use by
molecular geneticists, we developed a special-purpose editor for
manipulating maps of nucleic acid structures. Users typically enter raw
sequence data about a new structure and then, using English-like rules,
create many different types of maps relating to the structure. The time
to create these maps has been reduced by over an order of magnitude from
our previous best method, and it is now used here by Professors Paul Berg,
Stanley Cohen, Laurence Kedes, and Douglas Brutlag.
We sorely need tools for dealing with time-varying situations. VM, one
application program, took advantage of the fact that the ambiguity of the
domain was primarily due to changes in therapeutic context, rather than
to poor signal sources or to a very imprecise theory of how to interpret
the signal data. The representation of these therapeutic contexts is
limited by the assumption that only one of these contexts can be used to
describe the current setting. A new general time-oriented package allows
for a more flexible representation, permitting more ambiguity (from all
sources) but maintaining a simple control structure (forward-driven
processing) for manipulating the representation. This is accomplished by
attaching measures of belief/certainty to aggregates of conclusions that
have been derived over a period of time.
In the AGE system an attempt has been made to isolate inference, control,
and representation techniques from a few previous knowledge-based systems,
and reprogram them for domain independence. AGE-1 has been available to a
limited number of users on an experimental basis since October 1979. A more
public version is scheduled to be available in early July 1980. A three-day
workshop was conducted during the week of March 4, 1980 for a limited number
of people who had requested access to AGE. Without exception the
attendees represented organizations that wished to build knowledge-based
programs but could not do so for lack of qualified AI programmers.
The aim of the workshop was to familiarize the attendees with AGE (many of them
needed help in learning the Interlisp language) and to have a running program
by the end of the workshop. Each attendee was required to bring his own
problem to be implemented. Two non-medical applications groups were
represented, and several medical ones: the Information Science Group,
University of Missouri-Columbia (interpretation of test results for
determining the cause of blood coagulation problems in patients with
excessive bleeding); the Institute of Medical Electronics, University of
Tokyo (diagnosis of cardiovascular disease); and the Department of Psychology,
University of Colorado (building a psychological model of prose
comprehension).
3. Goals for the Coming Year
The CLOT group is currently implementing a knowledge acquisition system
which makes use of the generic, task-specific concepts (the ones isolated
this year) to interact with the medical expert, to assist him in staying
"focussed" during the diagnostic process. The self-explanation
capabilities added to UNITS will be tested in several domains this coming
year, and may lead to better ideas about explanation, and thence an even
better facility for EMYCIN. Dr. W. F. Bodmer of the British Imperial
Cancer Research Fund Laboratory is distributing the MOLGEN UNITS system to
his staff scientists to use in their routine work. At the request of Dr.
Douglas Brutlag, the UNITS editors will be made "faster" (not in running
time, but through a tailoring of the editors to the way in which
scientists want to enter their knowledge). The work on time-oriented
reasoning is continuing, with the aim of automating the generation of
multiple expectations for the incoming data. That is, what are appropriate
ranges for each parameter, and how can these be determined automatically?
For AGE-2, our plan is to improve the user interface so that
non-specialists can use the system without having to attend a workshop.
Additionally, both RLL and AGE are candidates for several applications
this year. As this program's charter indicates, such applications (both
medical and nonmedical) lead to improved knowledge engineering tools,
which in turn facilitate the creation of future expert systems. The
specific plans mentioned above for each project illustrate this theme.
BIBLIOGRAPHY
.group
(157) <<HPP-79-3>
Nelleke Aiello, H. Penny Nii
"Building A Knowledge-Based System With AGE,"
submitted to Sixth IJCAI79, February 1979.
.skip 2
.apart
.group
(158) <<HPP-79-4>
H. Penny Nii, Nelleke Aiello
"AGE (Attempt to Generalize): A Knowledge-Based Program
for Building Knowledge-Based Programs," submitted to
Sixth IJCAI79, February 1979.
.skip 2
.apart
.group
(159) <<HPP-79-5> (working paper)
L. Fagan, J. Kunz, E. Feigenbaum, CSD Stanford University
J.J. Osborn from PMC, San Francisco
"Knowledge Engineering for Dynamic Clinical Settings:
Giving Advice In The Intensive Care Unit," submitted to
Sixth IJCAI79, February 1979.
.skip 2
.apart
.group
(161) <<HPP-79-7> (working paper)
William van Melle,
"A Domain-independent Production-rule System For Consultation
Programs," submitted to Sixth IJCAI79, February 1979.
.skip 2
.apart
.group
(162) <<HPP-79-8> (working paper)
Alain Bonnet,
"BAOBAB-2 Understanding Medical Jargon As If It Were A
Natural Language," submitted to Sixth IJCAI79, February 1979.
.skip 2
.apart
.group
(163) <<HPP-79-9> (working paper)
William J. Clancey,
"Dialogue Management For Rule-based Tutorials," submitted to
Sixth IJCAI79, February 1979.
.skip 2
.apart
.group
(165) <<HPP-79-11> (working paper)
A. Barr, W. Clancey, J. Bennett
"Transfer of Expertise: A Theme of AI Research," April 1979.
.skip 2
.apart
.group
(167) <<HPP-79-13>
James S. Bennett, Robert S. Engelmore
"SACON: A Knowledge-Based Consultant For Structural Analysis,"
submitted to Sixth IJCAI79, April 1979.
.skip 2
.apart
.group
(169) <<HPP-79-15> (working paper)
Douglas B. Lenat
"Cognitive Economy,"
June 1979. Submitted to Sixth IJCAI79, August 1979.
.skip 2
.apart
.group
(172) <<HPP-79-18>
Larry M. Fagan, Edward H. Shortliffe, Bruce G. Buchanan
"Computer-Based Medical Decision Making: From MYCIN to VM,"
to appear in Automedica, July 1979.
.skip 2
.apart
.group
(174) <<HPP-79-20>
Edward H. Shortliffe, Bruce G. Buchanan, Edward Feigenbaum,
"Knowledge Engineering For Infectious Disease Therapy Selection"
in Proceedings of the IEEE, Vol. 67, No. 9, September 1979.
.skip 2
.apart
.group
(176)<<HPP-79-22>
STAN-CS-79-756
James S. Bennett, Bruce G. Buchanan, Paul R. Cohen
"Applications-oriented AI Research: Sciences and Mathematics"
to appear in Handbook of Artificial Intelligence, August 1979.
.skip 2
.apart
.group
(177)<<HPP-79-23>
STAN-CS-79-757
Victor Ciesielski, James S. Bennett and Paul R. Cohen,
"Applications-oriented AI Research: Medicine"
to appear in Handbook of Artificial Intelligence, August 1979.
.skip 2
.apart
.group
(180) <<HPP-79-26>
Edward H. Shortliffe, M.D., Ph.D.
"Clinical Knowledge Engineering: The MYCIN Project"
appeared in "Inference & Decision Making Processes in
Medicine," Proceedings of First International Workshop
on Methodologies for Disease Control, August 25, 1979
Tokyo Japan.
.skip 2
.apart
.group
(183) <<HPP-79-29>
Peter E. Friedland
"Knowledge-Based Experiment Design In Molecular Genetics,"
Ph.D. dissertation, Stanford University, October 1979.
.skip 2
.apart
.group
(157) <<HPP-80-2>
STAN-CS-80-784
Mark Jeffrey Stefik
"Planning With Constraints," Ph.D. dissertation,
Stanford University, January 1980.
.skip 2
.apart